Results 1 - 20 of 3,695
1.
PeerJ ; 12: e17091, 2024.
Article in English | MEDLINE | ID: mdl-38708339

ABSTRACT

Monitoring the diversity and distribution of species in an ecosystem is essential to assess the success of restoration strategies. Implementing biomonitoring methods, which provide a comprehensive assessment of species diversity and mitigate biases in data collection, holds significant importance in biodiversity research. Additionally, ensuring that these methods are cost-efficient and require minimal effort is crucial for effective environmental monitoring. In this study, we compare the efficiency of species detection, the cost, and the effort of two non-destructive sampling techniques, Baited Remote Underwater Video (BRUV) and environmental DNA (eDNA) metabarcoding, to survey marine vertebrate species. Comparisons were conducted along the Sussex coast upon the introduction of the Nearshore Trawling Byelaw. This Byelaw aims to boost the recovery of the dense kelp beds and the associated biodiversity that existed in the 1980s. We show that, overall, BRUV surveys are more affordable than eDNA; however, eDNA detects almost three times as many species as BRUV. eDNA and BRUV surveys are comparable in terms of effort required for each method, unless eDNA analysis is carried out externally, in which case eDNA requires less effort for the lead researchers. Furthermore, we show that increased eDNA replication yields more informative results on community structure. We found that using both methods in conjunction provides a more complete view of biodiversity, with BRUV data supplementing eDNA monitoring by recording species missed by eDNA and by providing additional environmental and life history metrics. The results from this study will serve as a baseline of the marine vertebrate community in Sussex Bay, allowing future biodiversity monitoring research projects to understand community structure as the ecosystem recovers following the removal of trawling fishing pressure. Although this study was regional, the findings presented herein have relevance to marine biodiversity and conservation monitoring programs around the globe.


Subject(s)
Biodiversity , DNA, Environmental , Environmental Monitoring , DNA, Environmental/analysis , DNA, Environmental/genetics , Animals , Environmental Monitoring/methods , Aquatic Organisms/genetics , Video Recording/methods , Ecosystem , DNA Barcoding, Taxonomic/methods
2.
IEEE Trans Image Process ; 33: 3256-3270, 2024.
Article in English | MEDLINE | ID: mdl-38696298

ABSTRACT

Video-based referring expression comprehension is a challenging task that requires locating the referred object in each video frame of a given video. While many existing approaches treat this task as an object-tracking problem, their performance is heavily reliant on the quality of the tracking templates. Furthermore, when there is not enough annotation data to assist in template selection, the tracking may fail. Other approaches are based on object detection, but they often use only one adjacent frame of the key frame for feature learning, which limits their ability to establish the relationship between different frames. In addition, improving the fusion of features from multiple frames and referring expressions to effectively locate the referents remains an open problem. To address these issues, we propose a novel approach called the Multi-Stage Image-Language Cross-Generative Fusion Network (MILCGF-Net), which is based on one-stage object detection. Our approach includes a Frame Dense Feature Aggregation module for dense feature learning of adjacent time sequences. Additionally, we propose an Image-Language Cross-Generative Fusion module as the main body of multi-stage learning to generate cross-modal features by calculating the similarity between video and expression, and then refining and fusing the generated features. To further enhance the cross-modal feature generation capability of our model, we introduce a consistency loss that constrains the image-language similarity and language-image similarity matrices during feature generation. We evaluate our proposed approach on three public datasets and demonstrate its effectiveness through comprehensive experimental results.
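As an illustration of the consistency-loss idea described in this abstract, the sketch below penalizes disagreement between an image-to-language similarity matrix and the transpose of a language-to-image similarity matrix. It is a minimal PyTorch sketch, not the paper's exact formulation: the symmetric KL formulation, the linear "cross-generative" projections, and the feature sizes are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def similarity_consistency_loss(sim_img2lang, sim_lang2img):
    """Encourage the image->language similarity matrix and the transpose of the
    language->image similarity matrix to agree. Both inputs are (N, N) matrices
    produced by different cross-modal pathways; a symmetric KL divergence over
    row-wise softmax distributions is used here as one possible choice."""
    p_log = F.log_softmax(sim_img2lang, dim=-1)
    q = F.softmax(sim_lang2img.t(), dim=-1)
    q_log = F.log_softmax(sim_lang2img.t(), dim=-1)
    p = F.softmax(sim_img2lang, dim=-1)
    return 0.5 * (F.kl_div(p_log, q, reduction="batchmean")
                  + F.kl_div(q_log, p, reduction="batchmean"))

if __name__ == "__main__":
    torch.manual_seed(0)
    # Hypothetical cross-generative pathways: each modality is projected toward
    # the other modality's space before similarities are computed.
    img_feats, lang_feats = torch.randn(8, 128), torch.randn(8, 128)
    img_to_lang = nn.Linear(128, 128)
    lang_to_img = nn.Linear(128, 128)

    sim_il = F.normalize(img_to_lang(img_feats), dim=-1) @ F.normalize(lang_feats, dim=-1).t()
    sim_li = F.normalize(lang_to_img(lang_feats), dim=-1) @ F.normalize(img_feats, dim=-1).t()

    loss = similarity_consistency_loss(sim_il, sim_li)
    loss.backward()
    print(loss.item())
```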


Subject(s)
Algorithms , Image Processing, Computer-Assisted , Video Recording , Video Recording/methods , Image Processing, Computer-Assisted/methods , Humans
3.
J Neuroeng Rehabil ; 21(1): 72, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38702705

ABSTRACT

BACKGROUND: Neurodegenerative diseases, such as Parkinson's disease (PD), necessitate frequent clinical visits and monitoring to identify changes in motor symptoms and provide appropriate care. By applying machine learning techniques to video data, automated video analysis has emerged as a promising approach to track and analyze motor symptoms, which could facilitate more timely intervention. However, existing solutions often rely on specialized equipment and recording procedures, which limits their usability in unstructured settings like the home. In this study, we developed a method to detect PD symptoms from unstructured videos of clinical assessments, without the need for specialized equipment or recording procedures. METHODS: Twenty-eight individuals with Parkinson's disease completed a video-recorded motor examination that included the finger-to-nose and hand pronation-supination tasks. Clinical staff provided ground truth scores for the level of Parkinsonian symptoms present. For each video, we used a pre-existing model called PIXIE to measure the location of several joints on the person's body and quantify how they were moving. Features derived from the joint angles and trajectories, designed to be robust to recording angle, were then used to train two types of machine-learning classifiers (random forests and support vector machines) to detect the presence of PD symptoms. RESULTS: The support vector machine trained on the finger-to-nose task had an F1 score of 0.93 while the random forest trained on the same task yielded an F1 score of 0.85. The support vector machine and random forest trained on the hand pronation-supination task had F1 scores of 0.20 and 0.33, respectively. CONCLUSION: These results demonstrate the feasibility of developing video analysis tools to track motor symptoms across variable perspectives. These tools do not work equally well for all tasks, however. This technology has the potential to overcome barriers to access for many individuals with degenerative neurological diseases like PD, providing them with a more convenient and timely method to monitor symptom progression, without requiring a structured video recording procedure. Ultimately, more frequent and objective home assessments of motor function could enable more precise telehealth optimization of interventions to improve clinical outcomes inside and outside of the clinic.
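To make the feature-and-classifier pipeline concrete, here is a minimal sketch of view-robust joint-angle features fed to the two classifier types named in the abstract. The joint indices, the specific summary statistics, and the synthetic keypoints are placeholders, not the study's actual feature set or data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def joint_angle(a, b, c):
    """Angle (degrees) at joint b formed by 3D keypoints a-b-c; it depends only
    on relative geometry, so it is insensitive to the recording angle."""
    v1, v2 = a - b, c - b
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))

def angle_features(keypoints):
    """keypoints: (T, J, 3) per-frame 3D joints (e.g., from a pose model such as
    PIXIE). Returns summary statistics of one elbow angle over time; the joint
    indices below are hypothetical."""
    SHOULDER, ELBOW, WRIST = 0, 1, 2
    angles = np.array([joint_angle(f[SHOULDER], f[ELBOW], f[WRIST]) for f in keypoints])
    vel = np.diff(angles)
    return np.array([angles.mean(), angles.std(), np.abs(vel).mean(), vel.std()])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for per-video feature vectors and symptom labels.
    X = np.stack([angle_features(rng.normal(size=(120, 3, 3))) for _ in range(40)])
    y = rng.integers(0, 2, size=40)
    for clf in (SVC(kernel="rbf"), RandomForestClassifier(n_estimators=200)):
        print(type(clf).__name__, cross_val_score(clf, X, y, cv=5, scoring="f1").mean())
```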


Subject(s)
Machine Learning , Parkinson Disease , Video Recording , Humans , Parkinson Disease/diagnosis , Parkinson Disease/physiopathology , Male , Female , Video Recording/methods , Middle Aged , Aged , Support Vector Machine
4.
Sensors (Basel) ; 24(9)2024 Apr 23.
Article in English | MEDLINE | ID: mdl-38732772

ABSTRACT

In mobile eye-tracking research, the automatic annotation of fixation points is an important yet difficult task, especially in varied and dynamic environments such as outdoor urban landscapes. This complexity is increased by the constant movement and dynamic nature of both the observer and their environment in urban spaces. This paper presents a novel approach that integrates the capabilities of two foundation models, YOLOv8 and Mask2Former, as a pipeline to automatically annotate fixation points without requiring additional training or fine-tuning. Our pipeline leverages YOLO's extensive training on the MS COCO dataset for object detection and Mask2Former's training on the Cityscapes dataset for semantic segmentation. This integration not only streamlines the annotation process but also improves accuracy and consistency, ensuring reliable annotations, even in complex scenes with multiple objects side by side or at different depths. Validation through two experiments showcases its efficiency, achieving 89.05% accuracy in a controlled data collection and 81.50% accuracy in a real-world outdoor wayfinding scenario. With an average runtime per frame of 1.61 ± 0.35 s, our approach stands as a robust solution for automatic fixation annotation.
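The detection half of such a pipeline can be sketched as follows: label each gaze fixation with the class of the YOLOv8 object whose bounding box contains it, falling back to semantic segmentation when no box matches. This assumes the ultralytics package (which downloads pretrained MS COCO weights on first use); the Mask2Former fallback is omitted, and the frame and fixation coordinates are placeholders.

```python
import numpy as np
from ultralytics import YOLO  # pip install ultralytics

model = YOLO("yolov8n.pt")  # pretrained on MS COCO

def annotate_fixation(frame, fixation_xy, conf=0.25):
    """Return the COCO class name of the detected object under the fixation
    point (x, y in pixels), or None if no detection box contains it."""
    result = model.predict(frame, conf=conf, verbose=False)[0]
    boxes = result.boxes.xyxy.cpu().numpy()            # (N, 4) x1, y1, x2, y2
    classes = result.boxes.cls.cpu().numpy().astype(int)
    x, y = fixation_xy
    for (x1, y1, x2, y2), cls in zip(boxes, classes):
        if x1 <= x <= x2 and y1 <= y <= y2:
            return model.names[cls]
    return None  # a segmentation model (e.g., Mask2Former) could be queried here

if __name__ == "__main__":
    frame = np.zeros((480, 640, 3), dtype=np.uint8)    # placeholder frame
    print(annotate_fixation(frame, (320, 240)))
```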


Subject(s)
Eye-Tracking Technology , Fixation, Ocular , Humans , Fixation, Ocular/physiology , Video Recording/methods , Algorithms , Eye Movements/physiology
5.
Sensors (Basel) ; 24(8)2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38676108

ABSTRACT

Egocentric activity recognition is a prominent computer vision task that is based on the use of wearable cameras. Since egocentric videos are captured through the perspective of the person wearing the camera, her/his body motions severely complicate the video content, imposing several challenges. In this work we propose a novel approach for domain-generalized egocentric human activity recognition. Typical approaches use a large amount of training data, aiming to cover all possible variants of each action. Moreover, several recent approaches have attempted to handle discrepancies between domains with a variety of costly and mostly unsupervised domain adaptation methods. In our approach we show that through simple manipulation of available source domain data and with minor involvement from the target domain, we are able to produce robust models, able to adequately predict human activity in egocentric video sequences. To this end, we introduce a novel three-stream deep neural network architecture combining elements of vision transformers and residual neural networks which are trained using multi-modal data. We evaluate the proposed approach using a challenging, egocentric video dataset and demonstrate its superiority over recent, state-of-the-art research works.


Subject(s)
Neural Networks, Computer , Video Recording , Humans , Video Recording/methods , Algorithms , Pattern Recognition, Automated/methods , Image Processing, Computer-Assisted/methods , Human Activities , Wearable Electronic Devices
6.
Sensors (Basel) ; 24(8)2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38676235

ABSTRACT

Most human emotion recognition methods largely depend on classifying stereotypical facial expressions that represent emotions. However, such facial expressions do not necessarily correspond to actual emotional states and may correspond to communicative intentions. In other cases, emotions are hidden, cannot be expressed, or may have lower arousal manifested by less pronounced facial expressions, as may occur during passive video viewing. This study improves an emotion classification approach developed in a previous study, which classifies emotions remotely without relying on stereotypical facial expressions or contact-based methods, using short facial video data. In this approach, we aim to remotely sense transdermal cardiovascular spatiotemporal facial patterns associated with different emotional states and analyze this data via machine learning. In this paper, we propose several improvements, which include better remote heart rate estimation via preliminary skin segmentation, an improved heartbeat peak and trough detection process, and better emotion classification accuracy achieved by employing an appropriate deep learning classifier using only RGB camera input data. We used the dataset obtained in the previous study, which contains facial videos of 110 participants who passively viewed 150 short videos that elicited the following five emotion types: amusement, disgust, fear, sexual arousal, and no emotion, while three cameras with different wavelength sensitivities (visible spectrum, near-infrared, and longwave infrared) recorded them simultaneously. From the short facial videos, we extracted unique high-resolution spatiotemporal, physiologically affected features and examined them as input features with different deep-learning approaches. An EfficientNet-B0 model type was able to classify participants' emotional states with an overall average accuracy of 47.36% using a single input spatiotemporal feature map obtained from a regular RGB camera.
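As a concrete reference point for the heart-rate-estimation step mentioned above, the sketch below implements a generic remote-PPG baseline: band-pass filter the mean intensity of skin-masked pixels, then detect heartbeat peaks. It is not the paper's transdermal spatiotemporal feature pipeline; the filter band, peak-detection parameters, and synthetic signal are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def estimate_heart_rate(green_trace, fps):
    """Estimate heart rate (bpm) from a per-frame mean skin-pixel intensity trace.

    green_trace: 1-D array, one value per frame (e.g., mean of skin-masked pixels)
    fps:         video frame rate
    """
    # Band-pass to the plausible heart-rate band (0.7-3.0 Hz, i.e., 42-180 bpm).
    b, a = butter(3, [0.7, 3.0], btype="bandpass", fs=fps)
    filtered = filtfilt(b, a, green_trace - np.mean(green_trace))

    # Peaks correspond to heartbeats; enforce a minimum inter-beat interval.
    peaks, _ = find_peaks(filtered, distance=int(fps / 3.0),
                          prominence=np.std(filtered) * 0.5)
    if len(peaks) < 2:
        return None
    ibi = np.diff(peaks) / fps                  # inter-beat intervals in seconds
    return 60.0 / np.mean(ibi)

if __name__ == "__main__":
    fps, hr_hz = 30, 1.2                        # synthetic 72-bpm pulse
    t = np.arange(0, 20, 1 / fps)
    trace = np.sin(2 * np.pi * hr_hz * t) + 0.3 * np.random.randn(t.size)
    print(f"{estimate_heart_rate(trace, fps):.1f} bpm")
```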


Subject(s)
Deep Learning , Emotions , Facial Expression , Heart Rate , Humans , Emotions/physiology , Heart Rate/physiology , Video Recording/methods , Image Processing, Computer-Assisted/methods , Face/physiology , Female , Male
7.
Sci Rep ; 14(1): 9481, 2024 04 25.
Article in English | MEDLINE | ID: mdl-38664466

ABSTRACT

In demersal trawl fisheries, the unavailability of catch information until the end of the catching process is a drawback, leading to seabed impacts and bycatch and reducing the economic performance of the fisheries. The emergence of in-trawl cameras to observe catches in real time can provide such information. This data needs to be processed in real time to determine catch compositions and rates, eventually improving the sustainability and economic performance of the fisheries. In this study, a real-time underwater video processing system that counts the Nephrops individuals entering the trawl has been developed using object detection and tracking methods on an edge device (NVIDIA Jetson AGX Orin). Seven state-of-the-art YOLO models were tested to identify the appropriate training settings and YOLO model. To achieve real-time processing and accurate counting simultaneously, four frame-skipping approaches were evaluated. It has been shown that the adaptive frame-skipping approach, together with the YOLOv8s model, can increase the processing speed up to 97.47 FPS while achieving a correct count rate of 82.57% and an F-score of 0.86. In conclusion, this system can improve the sustainability of the Nephrops-directed trawl fishery by providing catch information in real time.
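A minimal sketch of adaptive frame skipping is shown below: sample densely while targets are being detected and back off when nothing appears. The `detect_and_track` function is a hypothetical stand-in for the YOLOv8-plus-tracker stage described in the abstract, and the skip limits are illustrative.

```python
import random

def detect_and_track(frame):
    """Placeholder: return the number of newly counted individuals in this frame."""
    return random.choice([0, 0, 0, 1])

def count_with_adaptive_skipping(frames, min_skip=1, max_skip=8):
    """Count targets while skipping more frames when the scene is idle."""
    total, skip, idx = 0, max_skip, 0
    while idx < len(frames):
        new_counts = detect_and_track(frames[idx])
        total += new_counts
        # Adapt: dense sampling when targets appear, exponential back-off when idle.
        skip = min_skip if new_counts > 0 else min(max_skip, skip * 2)
        idx += skip
    return total

if __name__ == "__main__":
    random.seed(0)
    print(count_with_adaptive_skipping(list(range(1000))))
```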


Subject(s)
Fisheries , Animals , Video Recording/methods , Fishes/physiology , Image Processing, Computer-Assisted/methods , Algorithms , Models, Theoretical
8.
Neural Netw ; 175: 106319, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38640698

ABSTRACT

To enhance deep learning-based automated interictal epileptiform discharge (IED) detection, this study proposes a multimodal method, vEpiNet, that leverages video and electroencephalogram (EEG) data. Datasets comprise 24,931 IED (from 484 patients) and 166,094 non-IED 4-second video-EEG segments. The video data is processed by the proposed patient detection method, with frame difference and Simple Keypoints (SKPS) capturing patients' movements. EEG data is processed with EfficientNetV2. The video and EEG features are fused via a multilayer perceptron. We developed a comparative model, termed nEpiNet, to test the effectiveness of the video feature in vEpiNet. Ten-fold cross-validation was used for testing and showed high areas under the receiver operating characteristic curve (AUROC) in both models, with a slightly superior AUROC (0.9902) in vEpiNet compared to nEpiNet (0.9878). Moreover, to test the model performance in real-world scenarios, we set a prospective test dataset, containing 215 h of raw video-EEG data from 50 patients. The result shows that the vEpiNet achieves an area under the precision-recall curve (AUPRC) of 0.8623, surpassing nEpiNet's 0.8316. Incorporating video data raises precision from 70% (95% CI, 69.8%-70.2%) to 76.6% (95% CI, 74.9%-78.2%) at 80% sensitivity and reduces false positives by nearly a third, with vEpiNet processing one-hour video-EEG data in 5.7 min on average. Our findings indicate that video data can significantly improve the performance and precision of IED detection, especially in prospective real clinic testing. It suggests that vEpiNet is a clinically viable and effective tool for IED analysis in real-world applications.
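The fusion step named in the abstract (video and EEG features combined by a multilayer perceptron) can be sketched schematically as below. The feature dimensions, hidden size, and dropout rate are hypothetical; the keypoint-based video branch and the EfficientNetV2 EEG encoder are represented only by their output vectors.

```python
import torch
import torch.nn as nn

class FusionMLP(nn.Module):
    """Fuse a video-motion feature vector with an EEG feature vector via an MLP
    for binary IED / non-IED classification (a schematic of the fusion head only)."""
    def __init__(self, video_dim=128, eeg_dim=1280, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(video_dim + eeg_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden, 2),               # IED vs non-IED logits
        )

    def forward(self, video_feat, eeg_feat):
        return self.mlp(torch.cat([video_feat, eeg_feat], dim=-1))

if __name__ == "__main__":
    model = FusionMLP()
    logits = model(torch.randn(4, 128), torch.randn(4, 1280))
    print(logits.shape)  # torch.Size([4, 2])
```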


Subject(s)
Deep Learning , Electroencephalography , Epilepsy , Video Recording , Humans , Electroencephalography/methods , Video Recording/methods , Epilepsy/diagnosis , Epilepsy/physiopathology , Male , Female , Adult , Middle Aged , Adolescent , Neural Networks, Computer , Young Adult , Child
9.
J Cardiothorac Vasc Anesth ; 38(6): 1409-1416, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38503625

ABSTRACT

OBJECTIVE: The aim of this study was to use wearable video-recording technology to measure precisely the timing of discrete events during perioperative central venous catheter (CVC) placements. DESIGN: A single-center, observational, exploratory study on the use of wearable video-recording technology during intraoperative CVC placement. SETTING: The study was conducted at a University Hospital. PARTICIPANTS: Clinical anesthesia residents, cardiothoracic anesthesia fellows, and attending anesthesiologists participated in this study. INTERVENTIONS: Participants were asked to use eye-tracking glasses prior to the placement of a CVC in the cardiac operating rooms. No other instruction was given to the participants. MEASUREMENTS AND MAIN RESULTS: The authors measured the total time to complete the CVC placement, phase-specific time, and specific times of interest. They compared these times across 3 training levels and tested differences with analysis of variance. The authors' findings indicated significant differences in total CVC placement time when the procedure included a pulmonary artery catheter insertion (1,170 ± 364, 923 ± 272, and 596 ± 226 seconds; F2,63 = 12.71, p < 0.0001). Additionally, they found differences in interval times and times of interest. The authors observed a reduction of variability with increasing experience during the CVC placement phase. CONCLUSIONS: In this observational study, the study authors describe their experience using first-person wearable video-recording technology to precisely measure the timing of discrete events during CVC placement by anesthesia residents and anesthesiologists. Future work will leverage the eye-tracking capabilities of the existing hardware to identify areas of inefficiency to develop actionable targets for interventions that could improve trainee performance and patient safety.
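The statistical comparison reported here (analysis of variance on total placement times across three training levels) can be reproduced in outline as follows. The sample values are synthetic placeholders generated from the reported means and standard deviations, not the study data.

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
# Synthetic total CVC placement times (seconds), one group per training level.
residents  = rng.normal(1170, 364, size=22)
fellows    = rng.normal(923, 272, size=22)
attendings = rng.normal(596, 226, size=22)

f_stat, p_value = f_oneway(residents, fellows, attendings)
print(f"F = {f_stat:.2f}, p = {p_value:.2g}")
```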


Subject(s)
Catheterization, Central Venous , Operating Rooms , Video Recording , Humans , Video Recording/methods , Catheterization, Central Venous/methods , Catheterization, Central Venous/instrumentation , Wearable Electronic Devices , Cardiac Surgical Procedures/methods , Central Venous Catheters , Internship and Residency/methods , Male , Female , Anesthesiologists
10.
Epilepsia ; 65(5): 1176-1202, 2024 May.
Article in English | MEDLINE | ID: mdl-38426252

ABSTRACT

Computer vision (CV) shows increasing promise as an efficient, low-cost tool for video seizure detection and classification. Here, we provide an overview of the fundamental concepts needed to understand CV and summarize the structure and performance of various model architectures used in video seizure analysis. We conduct a systematic literature review of the PubMed, Embase, and Web of Science databases from January 1, 2000 to September 15, 2023, to identify the strengths and limitations of CV seizure analysis methods and discuss the utility of these models when applied to different clinical seizure phenotypes. Reviews, nonhuman studies, and those with insufficient or poor quality data are excluded from the review. Of the 1942 records identified, 45 meet inclusion criteria and are analyzed. We conclude that the field has shown tremendous growth over the past 2 decades, leading to several model architectures with impressive accuracy and efficiency. The rapid and scalable detection offered by CV models holds the potential to reduce sudden unexpected death in epilepsy and help alleviate resource limitations in epilepsy monitoring units. However, a lack of standardized, thorough validation measures and concerns about patient privacy remain important obstacles for widespread acceptance and adoption. Investigation into the performance of models across varied datasets from clinical and nonclinical environments is an essential area for further research.


Subject(s)
Seizures , Humans , Seizures/diagnosis , Seizures/classification , Electroencephalography/methods , Video Recording/methods
11.
Epilepsy Behav ; 154: 109735, 2024 May.
Article in English | MEDLINE | ID: mdl-38522192

ABSTRACT

Seizure events can manifest as transient disruptions in the control of movements which may be organized in distinct behavioral sequences, accompanied or not by other observable features such as altered facial expressions. The analysis of these clinical signs, referred to as semiology, is subject to observer variations when specialists evaluate video-recorded events in the clinical setting. To enhance the accuracy and consistency of evaluations, computer-aided video analysis of seizures has emerged as a natural avenue. In the field of medical applications, deep learning and computer vision approaches have driven substantial advancements. Historically, these approaches have been used for disease detection, classification, and prediction using diagnostic data; however, there has been limited exploration of their application in evaluating video-based motion detection in the clinical epileptology setting. While vision-based technologies do not aim to replace clinical expertise, they can significantly contribute to medical decision-making and patient care by providing quantitative evidence and decision support. Behavior monitoring tools offer several advantages such as providing objective information, detecting challenging-to-observe events, reducing documentation efforts, and extending assessment capabilities to areas with limited expertise. The main applications of these could be (1) improved seizure detection methods; (2) refined semiology analysis for predicting seizure type and cerebral localization. In this paper, we detail the foundation technologies used in vision-based systems in the analysis of seizure videos, highlighting their success in semiology detection and analysis, focusing on work published in the last 7 years. We systematically present these methods and indicate how the adoption of deep learning for the analysis of video recordings of seizures could be approached. Additionally, we illustrate how existing technologies can be interconnected through an integrated system for video-based semiology analysis. Each module can be customized and improved by adapting more accurate and robust deep learning approaches as these evolve. Finally, we discuss challenges and research directions for future studies.


Subject(s)
Deep Learning , Seizures , Video Recording , Humans , Seizures/diagnosis , Seizures/physiopathology , Video Recording/methods , Electroencephalography/methods
12.
IEEE J Biomed Health Inform ; 28(5): 3015-3028, 2024 May.
Article in English | MEDLINE | ID: mdl-38446652

ABSTRACT

The infant sleep-wake behavior is an essential indicator of physiological and neurological system maturity, the circadian transition of which is important for evaluating the recovery of preterm infants from inadequate physiological function and cognitive disorders. Recently, camera-based infant sleep-wake monitoring has been investigated, but the challenges of generalization caused by variance in infants and clinical environments are not addressed for this application. In this paper, we conducted a multi-center clinical trial at four hospitals to improve the generalization of camera-based infant sleep-wake monitoring. Using the face videos of 64 term and 39 preterm infants recorded in NICUs, we proposed a novel sleep-wake classification strategy, called consistent deep representation constraint (CDRC), that forces the convolutional neural network (CNN) to make consistent predictions for the samples from different conditions but with the same label, to address the variances caused by infants and environments. The clinical validation shows that by using CDRC, all CNN backbones obtain over 85% accuracy, sensitivity, and specificity in both the cross-age and cross-environment experiments, improving the ones without CDRC by almost 15% in all metrics. This demonstrates that by improving the consistency of the deep representation of samples with the same state, we can significantly improve the generalization of infant sleep-wake classification.
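The consistent-representation idea can be illustrated with a small loss term that pulls the predictions of same-label samples (recorded under different conditions) toward each other. This is a sketch of the constraint concept, not the paper's exact CDRC formulation; the class-mean target and the loss weight are assumptions.

```python
import torch
import torch.nn.functional as F

def consistency_loss(logits, labels):
    """For samples sharing the same sleep/wake label, penalize divergence of
    their predicted distributions from the per-class mean prediction."""
    probs = F.softmax(logits, dim=-1)
    loss, groups = logits.new_zeros(()), 0
    for c in labels.unique():
        group = probs[labels == c]
        if group.size(0) < 2:
            continue
        center = group.mean(dim=0, keepdim=True).detach()
        loss = loss + F.mse_loss(group, center.expand_as(group))
        groups += 1
    return loss / max(groups, 1)

if __name__ == "__main__":
    logits = torch.randn(8, 2, requires_grad=True)
    labels = torch.tensor([0, 0, 1, 1, 0, 1, 0, 1])
    total = F.cross_entropy(logits, labels) + 0.5 * consistency_loss(logits, labels)
    total.backward()
    print(total.item())
```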


Subject(s)
Intensive Care Units, Neonatal , Sleep , Video Recording , Humans , Infant, Newborn , Video Recording/methods , Sleep/physiology , Monitoring, Physiologic/methods , Male , Female , Infant, Premature/physiology , Neural Networks, Computer , Wakefulness/physiology , Infant , Image Processing, Computer-Assisted/methods
13.
IEEE J Biomed Health Inform ; 28(5): 2955-2966, 2024 May.
Article in English | MEDLINE | ID: mdl-38345952

ABSTRACT

Video-based Photoplethysmography (VPPG) offers the capability to measure heart rate (HR) from facial videos. However, the reliability of the HR values extracted through this method remains uncertain, especially when videos are affected by various disturbances. Confronted with this challenge, we introduce an innovative framework for VPPG-based HR measurements, with a focus on capturing diverse sources of uncertainty in the predicted HR values. In this context, a neural network named HRUNet is structured for HR extraction from input facial videos. Departing from the conventional training approach of learning specific weight (and bias) values, we leverage Bayesian posterior estimation to derive weight distributions within HRUNet. These distributions allow for sampling to encode uncertainty stemming from HRUNet's limited performance. On this basis, we redefine HRUNet's output as a distribution of potential HR values, as opposed to the traditional emphasis on the single most probable HR value. The underlying goal is to discover the uncertainty arising from inherent noise in the input video. HRUNet is evaluated across 1,098 videos from seven datasets, spanning three scenarios: undisturbed, motion-disturbed, and light-disturbed. The ensuing test outcomes demonstrate that uncertainty in the HR measurements increases significantly in the scenarios marked by disturbances, compared to that in the undisturbed scenario. Moreover, HRUNet outperforms state-of-the-art methods in HR accuracy when HR values with uncertainty above 0.4 are excluded. This underscores that uncertainty emerges as an informative indicator of potentially erroneous HR measurements. With enhanced reliability affirmed, the VPPG technique holds promise for applications in safety-critical domains.
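One simple way to obtain a distribution of HR predictions rather than a point estimate, sketched below, is Monte Carlo dropout: keep dropout active at test time and treat repeated forward passes as samples from an approximate weight posterior. This is a generic illustration of predictive uncertainty, not HRUNet's architecture; the toy regressor and sampling count are assumptions.

```python
import torch
import torch.nn as nn

class TinyHRNet(nn.Module):
    """Toy HR regressor; the dropout layers provide the stochasticity used for
    Monte Carlo sampling of predictions."""
    def __init__(self, in_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

def predict_with_uncertainty(model, x, n_samples=50):
    model.train()                         # keep dropout active at test time
    with torch.no_grad():
        samples = torch.stack([model(x) for _ in range(n_samples)])
    return samples.mean(0), samples.std(0)    # predicted HR and its spread

if __name__ == "__main__":
    model, x = TinyHRNet(), torch.randn(3, 64)
    mean_hr, hr_std = predict_with_uncertainty(model, x)
    print(mean_hr, hr_std)   # a large std flags potentially unreliable measurements
```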


Subject(s)
Face , Heart Rate , Photoplethysmography , Signal Processing, Computer-Assisted , Video Recording , Humans , Heart Rate/physiology , Photoplethysmography/methods , Face/physiology , Video Recording/methods , Uncertainty , Neural Networks, Computer , Adult , Bayes Theorem , Male , Female , Young Adult , Image Processing, Computer-Assisted/methods , Algorithms , Reproducibility of Results
14.
IEEE J Biomed Health Inform ; 28(5): 2943-2954, 2024 May.
Article in English | MEDLINE | ID: mdl-38412077

ABSTRACT

In the fetal cardiac ultrasound examination, standard cardiac cycle (SCC) recognition is the essential foundation for diagnosing congenital heart disease. Previous studies have mostly focused on the detection of adult CCs, which may not be applicable to the fetus. In clinical practice, localization of SCCs requires accurate recognition of end-systole (ES) and end-diastole (ED) frames, ensuring that every frame in the cycle is a standard view. Most existing methods are not based on the detection of key anatomical structures, so they may fail to recognize irrelevant views and background frames, may produce results containing non-standard frames, or may not work at all in clinical practice. We propose an end-to-end hybrid neural network based on an object detector to detect SCCs from fetal ultrasound videos efficiently; it consists of 3 modules, namely Anatomical Structure Detection (ASD), Cardiac Cycle Localization (CCL), and Standard Plane Recognition (SPR). Specifically, ASD uses an object detector to identify 9 key anatomical structures, 3 cardiac motion phases, and the corresponding confidence scores from fetal ultrasound videos. On this basis, we propose a joint probability method in the CCL to learn the cardiac motion cycle based on the 3 cardiac motion phases. In SPR, to reduce the impact of structure detection errors on the accuracy of the standard plane recognition, we use the XGBoost algorithm to learn the relational knowledge of the detected anatomical structures. We evaluate our method on test fetal ultrasound video datasets and clinical examination cases and achieve remarkable results. This study may pave the way for clinical practice.
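The SPR idea (learning whether a frame is a standard plane from per-structure detection outputs with gradient-boosted trees) can be sketched as below. The nine detection confidences, the synthetic labeling rule, and the model hyperparameters are placeholders, not the paper's detector output or configuration; the xgboost package is assumed.

```python
import numpy as np
from xgboost import XGBClassifier  # pip install xgboost

rng = np.random.default_rng(0)
n_frames, n_structures = 500, 9

# Detection confidences for the 9 key anatomical structures (synthetic).
conf = rng.uniform(0, 1, size=(n_frames, n_structures))
# Toy labeling rule: standard planes tend to have most structures confidently detected.
labels = ((conf > 0.5).sum(axis=1) >= 7).astype(int)

clf = XGBClassifier(n_estimators=100, max_depth=3)
clf.fit(conf[:400], labels[:400])
print("held-out accuracy:", (clf.predict(conf[400:]) == labels[400:]).mean())
```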


Subject(s)
Fetal Heart , Image Interpretation, Computer-Assisted , Neural Networks, Computer , Ultrasonography, Prenatal , Humans , Ultrasonography, Prenatal/methods , Female , Pregnancy , Image Interpretation, Computer-Assisted/methods , Fetal Heart/diagnostic imaging , Fetal Heart/physiology , Algorithms , Heart Defects, Congenital/diagnostic imaging , Video Recording/methods
15.
IEEE Trans Med Imaging ; 43(5): 1792-1803, 2024 May.
Article in English | MEDLINE | ID: mdl-38163305

ABSTRACT

Deep learning techniques have been investigated for the computer-aided diagnosis of thyroid nodules in ultrasound images. However, most existing thyroid nodule detection methods were simply based on static ultrasound images, which cannot well explore spatial and temporal information following the clinical examination process. In this paper, we propose a novel video-based semi-supervised framework for ultrasound thyroid nodule detection. Especially, considering clinical examinations that need to detect thyroid nodules at the ultrasonic probe positions, we first construct an adjacent frame guided detection backbone network by using adjacent supporting reference frames. To further reduce the labour-intensive thyroid nodule annotation in ultrasound videos, we extend the video-based detection in a semi-supervised manner by using both labeled and unlabeled videos. Based on the detection consistency in sequential neighbouring frames, a pseudo label adaptation strategy is proposed for the refinement of unpredicted frames. The proposed framework is validated on 996 transverse viewed and 1088 longitudinal viewed ultrasound videos. Experimental results demonstrated the superior performance of our proposed method in the ultrasound video-based detection of thyroid nodules.


Subject(s)
Deep Learning , Image Interpretation, Computer-Assisted , Thyroid Nodule , Ultrasonography , Video Recording , Thyroid Nodule/diagnostic imaging , Humans , Ultrasonography/methods , Image Interpretation, Computer-Assisted/methods , Video Recording/methods , Algorithms , Thyroid Gland/diagnostic imaging
16.
Seizure ; 115: 68-74, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38218112

ABSTRACT

PURPOSE: Drug-resistant epilepsy affects a substantial proportion (30-40%) of patients with epilepsy, often necessitating video-electroencephalography (video-EEG) monitoring. In 2016, Sauro et al. introduced a set of measures aimed at improving the quality and safety indicators reported in video-EEG evaluations. This study aims to report our experience with the implementation of these measures. METHODS: We analyzed video-EEG data regarding quality and safety from a period spanning January 2016 to January 2018, involving a total of 101 patients monitored in our video-EEG unit. RESULTS: Among the patients included in the study, a definitive diagnosis was attainable for 92.1 %, with 36.6 % experiencing a change in diagnosis and 65.3 % undergoing a change in treatment as a result of the video-EEG evaluation. Additionally, the referral question was fully addressed in 60.4 % of admissions, and video-EEG was considered to be very useful or extremely useful in 66.4 % of cases. Adverse events were observed in 26.7 % of patients, with the most common being the progression of focal seizures to bilateral tonic-clonic seizures (11.9 %) and the occurrence of seizure clusters (5.9 %). CONCLUSION: Our findings support the implementation of Sauro et al.'s set of measures, as they provide valuable criteria for improving the reporting of video-EEG quality and safety indicators. However, challenges may arise due to variations in terminology across studies and the lack of standardized criteria for defining essential questions in video-EEG evaluations. Further research utilizing these measures is necessary to enhance their effectiveness and encourage consistent reporting of results from epilepsy monitoring units.


Subject(s)
Epilepsy , Quality Indicators, Health Care , Humans , Brazil , Video Recording/methods , Seizures/diagnosis , Seizures/etiology , Epilepsy/diagnosis , Epilepsy/etiology , Monitoring, Physiologic/methods , Electroencephalography/methods
17.
Int J Neural Syst ; 34(2): 2450005, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38063381

ABSTRACT

Autism Spectrum Disorder (ASD) is a complex and heterogeneous neurodevelopmental disorder which affects a significant proportion of the population, with estimates suggesting that about 1 in 100 children worldwide are affected by ASD. This study introduces a new Deep Neural Network for identifying ASD in children through gait analysis, using features extracted from frames composing video recordings of their walking patterns. The innovative method presented herein is based on imagery and combines gait analysis and deep learning, offering a noninvasive and objective assessment of neurodevelopmental disorders while delivering high accuracy in ASD detection. Our model proposes a bimodal approach based on the concatenation of two distinct Convolutional Neural Networks processing two feature sets extracted from the same videos. The features obtained from the convolutions of both networks are subsequently flattened and merged into a single vector, serving as input for the fully connected layers in the binary classification process. This approach demonstrates the potential for effective ASD detection in children through the combination of gait analysis and deep learning techniques.
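The bimodal design described here (two independent convolutional streams whose flattened features are concatenated and passed to fully connected layers for binary classification) can be sketched schematically as below. The stream architecture, input sizes, and feature dimensions are hypothetical stand-ins for the paper's networks.

```python
import torch
import torch.nn as nn

class StreamCNN(nn.Module):
    """One convolutional stream over a (C, H, W) gait-feature image."""
    def __init__(self, in_channels=3, out_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
        )
        self.fc = nn.Linear(32 * 4 * 4, out_dim)

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

class BimodalGaitNet(nn.Module):
    """Two CNN streams over two feature sets from the same videos; their
    flattened outputs are concatenated into a single vector and classified
    by fully connected layers (ASD vs. non-ASD)."""
    def __init__(self):
        super().__init__()
        self.stream_a = StreamCNN()
        self.stream_b = StreamCNN()
        self.classifier = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, xa, xb):
        fused = torch.cat([self.stream_a(xa), self.stream_b(xb)], dim=1)
        return self.classifier(fused)

if __name__ == "__main__":
    model = BimodalGaitNet()
    logits = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))
    print(logits.shape)  # torch.Size([2, 2])
```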


Subject(s)
Autism Spectrum Disorder , Deep Learning , Child , Humans , Autism Spectrum Disorder/diagnosis , Neural Networks, Computer , Video Recording/methods
18.
Int J Speech Lang Pathol ; 26(2): 212-224, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37294183

ABSTRACT

PURPOSE: This research investigates the relative effectiveness of independent online and blended learning approaches for novice analysts' development of videofluoroscopic swallowing study (VFSS) analytical skills. The secondary aims were to explore the impact of training on decision-making and to describe learners' perspectives of training outcomes. METHOD: Undergraduate speech-language pathology students (n = 74) who had completed the dysphagia academic curriculum in an undergraduate speech-language pathology program were recruited for a randomised control trial. The ability to identify swallowing impairments in adults was compared pre- and post-training across three conditions: independent online (n = 23), peer-supported (n = 23), and expert-facilitated training (n = 28). The training comprised online VFSS training and practice with a commercially available digital video disc (DVD). RESULT: The three training approaches were equal in improving novice analysts' identification of impairments on VFSS. Participants' analysis improved pre- to post-training (p = <.001), with no statistical difference amongst training conditions (p = .280). However, the expert facilitation condition resulted in better decision-making skill for novice analysts, as well as higher levels of confidence and greater engagement in the learning. CONCLUSION: Well-designed independent online methods are appropriate to prepare novice analysts for VFSS analytical training. Expert facilitation and peer-supported environments may have benefits for more advanced skill development and engagement, and should be investigated in future studies.


Subject(s)
Deglutition Disorders , Speech-Language Pathology , Adult , Humans , Deglutition , Speech , Test Taking Skills , Video Recording/methods , Speech-Language Pathology/methods
19.
Int J Speech Lang Pathol ; 26(2): 225-232, 2024 Apr.
Article in English | MEDLINE | ID: mdl-37403440

ABSTRACT

PURPOSE: With two-thirds of adults presenting for a videofluoroscopy swallow study (VFSS) with oesophageal abnormalities, it seems prudent to include visualisation of the oesophagus, in the context of the entire swallow process, to provide further information to the diagnostic team. This study aims to evaluate the ability of speech-language pathologists (SLPs) to interpret oesophageal sweep on VFSS and the relative improvement in that ability with additional training. METHOD: One hundred SLPs attended training in oesophageal visualisation during VFSS, based on a previous study. Ten oesophageal sweep videos (five normal, five abnormal) with one 20 ml thin fluid barium bolus (19% w/v) were presented at baseline and following training. Raters were blinded to patient information other than age. Binary ratings were collected for oesophageal transit time (OTT), presence of stasis, redirection, and referral to other specialists. RESULT: Inter-rater reliability as measured by Fleiss' kappa improved for all parameters, reaching statistical significance for OTT (pre-test kappa = 0.34, post-test kappa = 0.73; p < 0.01) and redirection (pre-test kappa = 0.38, post-test kappa = 0.49; p < 0.05). Overall agreement improved significantly (p < 0.001) for all parameters except stasis, where improvement was only slight. Interaction between pre-post and type of video (normal/abnormal) was statistically significant (p < 0.001) for redirection, with a large pre-post increase in positive accuracy compared with a slight pre-post decrease in negative accuracy. CONCLUSION: Findings indicate that SLPs require training to accurately interpret an oesophageal sweep on VFSS. This supports the inclusion of education and training on both normal and abnormal oesophageal sweep patterns, and the use of standardised protocols for clinicians using oesophageal visualisation as part of the VFSS protocol.
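The reliability analysis reported here (Fleiss' kappa over binary ratings given by many raters on the same set of videos) can be reproduced in outline as below. The rating matrix is synthetic placeholder data, not the study ratings; the statsmodels package is assumed.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
n_videos, n_raters = 10, 100

# ratings[i, j] = binary judgment (e.g., prolonged OTT yes/no) of video i by rater j.
truth = np.repeat([0, 1], 5)                       # 5 normal, 5 abnormal videos
p_positive = np.where(truth[:, None] == 1, 0.8, 0.2)
ratings = (rng.uniform(size=(n_videos, n_raters)) < p_positive).astype(int)

counts, _ = aggregate_raters(ratings)              # videos x categories count table
print("Fleiss' kappa:", round(fleiss_kappa(counts), 3))
```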


Subject(s)
Deglutition Disorders , Adult , Humans , Deglutition Disorders/diagnosis , Deglutition , Reproducibility of Results , Pathologists , Speech , Video Recording/methods
20.
Phlebology ; 39(1): 58-65, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37902613

ABSTRACT

OBJECTIVE: YouTube® has gained popularity as an unofficial educational resource for surgical trainees, but its content's quality and educational value remain to be evaluated. The aim of this study is to analyze the current content on these techniques for lower extremity DVT (LEDVT) on YouTube®. METHODS: A search was performed on YouTube® using 13 search terms in August 2022 on a clear-cached browser. Open-access videos focusing on the surgical techniques of venous thrombolysis or thrombectomy for LEDVT were included. Quality and educational value were assessed and graded based on metrics for accountability (4 items), content (13 items), and production (9 items). RESULTS: Out of 138 videos regarding LEDVT oriented towards medical professionals, only 14 met inclusion criteria. Videos ran for a median of 3.4 min (range 0.37-35.6 min) with a median of 941 views (range 106-54624). Videos scored a median of 5.5 (range 1.0-8.0) out of 11 for content, a median of 2.0 out of 6.0 (range 0.0-2.0) for accountability, and a median of 5.5 out of 9.0 (range 3.0-9.0) for production. CONCLUSION: Few YouTube® videos focus on the technical aspects of DVT thrombolysis/thrombectomy, and they vary significantly in content with overall poor accountability and production quality.


Subject(s)
Social Media , Venous Thrombosis , Humans , Video Recording/methods , Veins , Venous Thrombosis/therapy , Thrombolytic Therapy